Energy Normalization in Automatic Speech Recognition
Authors
Abstract
This paper presents a novel method for energy normalization. Its objective is to remove unwanted energy variations caused by different microphone gains, differing loudness levels across speakers, and changes in a single speaker's loudness level over time. The solution is based on principles used in automatic gain control. Using this method yields a 26% relative improvement in the performance of an automatic speech recognition system.
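The abstract does not spell out the algorithm, so the following is only a rough sketch of the general automatic gain control idea it refers to, not the authors' method. It assumes a simple level tracker that rises quickly toward louder frames and decays slowly, and subtracts that level from each frame's log energy. The function names `frame_log_energies` and `agc_normalize` and the parameters `frame_len`, `attack`, and `release` are hypothetical and chosen for illustration.

```python
import math


def frame_log_energies(samples, frame_len=160):
    """Natural-log energy of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        energies.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return energies


def agc_normalize(log_energies, attack=0.5, release=0.01):
    """Subtract a tracked loudness level from each frame's log energy.

    Assumed scheme (not taken from the paper): the level moves quickly
    toward louder frames (attack) and slowly toward quieter ones (release).
    """
    if not log_energies:
        return []
    level = log_energies[0]
    normalized = []
    for e in log_energies:
        rate = attack if e > level else release
        level += rate * (e - level)
        normalized.append(e - level)
    return normalized


# Usage: the same signal recorded at 10x the gain should normalize
# to (almost) the same energy contour.
samples = [math.sin(0.1 * n) * (0.2 + 0.8 * (n // 800 % 2)) for n in range(4800)]
quiet = agc_normalize(frame_log_energies(samples))
loud = agc_normalize(frame_log_energies([10.0 * x for x in samples]))
```

Because a constant gain only adds a constant offset to the log energy, and the tracked level absorbs that same offset, the two normalized contours should agree up to the effect of the small energy floor. Slow loudness drift within one recording is followed by the tracker in the same way, at a speed set by `attack` and `release`.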
Related articles
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
A Log-energy Scaling Normalization Scheme for Robust Speech Recognition
The log-energy parameter, as an auxiliary but influential feature, has been commonly used to augment Mel-frequency cepstral coefficients (MFCCs) to improve the recognition accuracy in automatic speech recognition (ASR). In this paper, a new and effective scaling approach named log-energy scaling normalization (LESN), which utilizes special nonlinear scaling functions on noisy speech data for lo...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
A Robust, Real-time Endpoint Detector with Energy Normalization for ASR in Adverse Environments
When automatic speech recognition (ASR) is applied in hands-free or other adverse acoustic environments, endpoint detection and energy normalization can be crucial to the entire system. In low signal-to-noise ratio (SNR) situations, conventional approaches to endpointing and energy normalization often fail, and ASR performance usually degrades dramatically. The goal of this paper is to find a fast, acc...
Speaker normalization for automatic speech recognition - An on-line approach
We propose a method to transform the on-line speech signal so as to comply with the specifications of an HMM-based automatic speech recognizer. The spectrum of the input signal undergoes a vocal tract length (VTL) normalization based on differences of the average third formant F3. The high-frequency gap that is generated after scaling is estimated by means of an extrapolation scheme. Mel scale c...
Publication date: 2008